AITopics | maximum score

maximum score

Block Sparse Flash Attention

Ohayon, Daniel, Lamprecht, Itay, Hubara, Itay, Cohen, Israel, Soudry, Daniel, Elata, Noam

arXiv.org Artificial Intelligence | Dec-9-2025

Modern large language models increasingly require long contexts for reasoning and multi-document tasks, but attention's quadratic complexity creates a severe computational bottleneck. We present Block-Sparse FlashAttention (BSFA), a drop-in replacement that accelerates long-context inference while preserving model quality. Unlike methods that predict importance before computing scores, BSFA computes exact query-key similarities to select the top-k most important value blocks for each query. By comparing per-block maximum scores against calibrated thresholds, we skip approximately 50% of the computation and memory transfers for pruned blocks. Our training-free approach requires only a one-time threshold calibration on a small dataset to learn the per-layer and per-head attention score distributions. We provide a CUDA kernel implementation that can be used as a drop-in replacement for FlashAttention. On Llama-3.1-8B, BSFA achieves up to 1.10x speedup on real-world reasoning benchmarks and up to 1.24x for needle-in-a-haystack retrieval tasks while maintaining above 99% baseline accuracy, with certain configurations even improving accuracy by focusing on the most relevant content, substantially outperforming existing sparse attention methods. The implementation is available at https://github.com/Danielohayon/Block-Sparse-Flash-Attention
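The block-skipping idea in this abstract can be illustrated with a minimal single-query sketch in plain Python. It is not the authors' CUDA kernel: the function name `block_sparse_attention`, the single scalar `threshold` (standing in for one calibrated per-layer, per-head value), and the running-softmax bookkeeping are all assumptions made for illustration.

```python
import math

def block_sparse_attention(q, keys, values, block_size, threshold):
    """Single-query attention that skips key/value blocks by max score.

    Hypothetical sketch: `threshold` stands in for one calibrated
    per-layer, per-head value; the real kernel is not reproduced here.
    """
    scale = 1.0 / math.sqrt(len(q))
    running_max = -math.inf          # running max for a stable softmax
    denom = 0.0                      # running softmax denominator
    acc = [0.0] * len(values[0])     # running weighted sum of values
    skipped = 0

    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        # Exact query-key scores for this block.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_blk]
        if max(scores) < threshold:
            skipped += 1             # pruned: the value block is never read
            continue
        v_blk = values[start:start + block_size]
        new_max = max(running_max, max(scores))
        rescale = math.exp(running_max - new_max)  # 0.0 on the first kept block
        denom *= rescale
        acc = [a * rescale for a in acc]
        for s, v in zip(scores, v_blk):
            w = math.exp(s - new_max)
            denom += w
            acc = [a + w * x for a, x in zip(acc, v)]
        running_max = new_max

    if denom == 0.0:
        raise ValueError("threshold pruned every block")
    return [a / denom for a in acc], skipped
```

Setting `threshold` to `-math.inf` keeps every block and reduces the sketch to ordinary softmax attention, which is a convenient way to check how much a given threshold changes the output.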

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2512.07011

Country: Asia > Middle East > Israel (0.28)

Genre: Research Report (0.85)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

ConSmax: Hardware-Friendly Alternative Softmax with Learnable Parameters

Liu, Shiwei, Tao, Guanchen, Zou, Yifei, Chow, Derek, Fan, Zichen, Lei, Kauna, Pan, Bangfei, Sylvester, Dennis, Kielian, Gregory, Saligane, Mehdi

arXiv.org Artificial Intelligence | Feb-20-2024

The self-attention mechanism sets transformer-based large language models (LLMs) apart from convolutional and recurrent neural networks. Despite the performance improvement, achieving real-time LLM inference on silicon is challenging due to the extensive use of Softmax in self-attention. Apart from the non-linearity, the low arithmetic intensity greatly reduces the processing parallelism, which becomes the bottleneck, especially when dealing with a longer context. To address this challenge, we propose Constant Softmax (ConSmax), a software-hardware co-design as an efficient Softmax alternative. ConSmax employs differentiable normalization parameters to remove the maximum searching and denominator summation in Softmax. It allows for massive parallelization while performing the critical tasks of Softmax. In addition, scalable ConSmax hardware utilizing a bitwidth-split look-up table (LUT) can produce lossless non-linear operation and support mixed-precision computing. It further facilitates efficient LLM inference. Experimental results show that ConSmax achieves a minuscule power consumption of 0.43 mW and an area of 0.001 mm² at a 1-GHz working frequency in 22-nm CMOS technology. Compared to state-of-the-art Softmax hardware, ConSmax results in 14.5x energy and 14.0x area savings with comparable accuracy on a GPT-2 model and the WikiText103 dataset.
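A small sketch may help show what removing the maximum search and denominator summation means. It assumes the normalization takes the form exp(s - beta) / gamma, with `beta` and `gamma` as fixed scalars standing in for learned parameters; that form and the function names are assumptions for illustration, not a reproduction of the paper's hardware or training setup.

```python
import math

def softmax(scores):
    """Standard Softmax: needs two reductions over the whole row."""
    m = max(scores)                           # maximum search
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)                         # denominator summation
    return [e / total for e in exps]

def consmax(scores, beta, gamma):
    """Assumed ConSmax-style form: each element is normalized independently.

    beta and gamma are constants here (hypothetical stand-ins for learned
    parameters), so no reduction over the row is needed.
    """
    return [math.exp(s - beta) / gamma for s in scores]

scores = [1.0, 2.0, 3.0]
# If the constants happen to equal the row's own statistics, the two agree.
beta = max(scores)
gamma = sum(math.exp(s - beta) for s in scores)
matched = consmax(scores, beta, gamma)
reference = softmax(scores)
```

With any other choice of `beta` and `gamma` the outputs no longer sum to 1, which is the trade-off this kind of element-wise normalization makes in exchange for parallelism.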

consmax, hardware, softmax, (16 more...)

arXiv.org Artificial Intelligence

2402.1093

Country:

North America > United States > Michigan (0.04)
Europe > Lithuania > Kaunas County > Kaunas (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Language Decision Transformers with Exponential Tilt for Interactive Text Environments

Gontier, Nicolas, Rodriguez, Pau, Laradji, Issam, Vazquez, David, Pal, Christopher

arXiv.org Artificial Intelligence | Nov-17-2023

Text-based game environments are challenging because agents must deal with long sequences of text, execute compositional actions using text and learn from sparse rewards. We address these challenges by proposing Language Decision Transformers (LDTs), a framework that is based on transformer language models and decision transformers (DTs). Our LDTs extend DTs with 3 components: (1) exponential tilt to guide the agent towards high obtainable goals, (2) novel goal conditioning methods yielding better results than the traditional return-to-go (sum of all future rewards), and (3) a model of future observations that improves agent performance. LDTs are the first to address offline RL with DTs on these challenging games. Our experiments show that LDTs achieve the highest scores among many different types of agents on some of the most challenging Jericho games, such as Enchanter.
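Two of the quantities named in this abstract can be sketched in a few lines. The return-to-go below follows the usual decision-transformer convention of summing rewards from the current step onward. The tilt is shown as reweighting a discrete goal distribution by exp(kappa * g); the function names, the dictionary representation, and `kappa` are assumptions for illustration rather than the paper's implementation.

```python
import math

def returns_to_go(rewards):
    """Suffix sums: entry t is the sum of rewards from step t to the end."""
    rtg, total = [], 0.0
    for r in reversed(rewards):
        total += r
        rtg.append(total)
    return rtg[::-1]

def exponential_tilt(goal_probs, kappa):
    """Reweight a discrete goal distribution p(g) by exp(kappa * g).

    goal_probs maps a goal value g to its probability (hypothetical format).
    Larger kappa shifts probability mass towards higher goals.
    """
    weights = {g: p * math.exp(kappa * g) for g, p in goal_probs.items()}
    z = sum(weights.values())
    return {g: w / z for g, w in weights.items()}

rtg = returns_to_go([0, 0, 5, 0, 10])
tilted = exponential_tilt({0: 0.5, 10: 0.5}, kappa=0.5)
```

With `kappa=0` the distribution is unchanged, so the parameter acts as a dial between sampling goals as the data suggests and pushing towards the highest goals the model considers obtainable.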

agent, goal condition, trajectory, (17 more...)

arXiv.org Artificial Intelligence

2302.05507

Country:

North America > Canada > Quebec > Montreal (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A Hybrid Evolutionary Approach to Solve University Course Allocation Problem

Dofadar, Dibyo Fabian, Khan, Riyo Hayat, Hasan, Shafqat, Taj, Towshik Anam, Shakil, Arif, Majumdar, Mahbub

arXiv.org Artificial Intelligence | Jul-24-2023

This paper discusses various types of constraints, difficulties and solutions to overcome the challenges of the university course allocation problem. A hybrid evolutionary algorithm has been defined, combining a Local Repair Algorithm and a Modified Genetic Algorithm to generate the best course assignment. After analyzing the collected dataset, all the necessary constraints were formulated. These constraints cover the aspects that must be kept in mind while preparing clash-free and efficient class schedules for every faculty member. The goal is to generate an optimized solution that fulfills those constraints while maintaining time efficiency and reducing the workload of handling this task manually. The proposed algorithm was compared with some base-level optimization algorithms to show its better efficiency in terms of accuracy and time.

artificial intelligence, evolutionary algorithm, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/AIBT53261.2021.00015

2212.0223

Country: Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Setting > Higher Education (0.87)
Education > Curriculum (0.62)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Data Leakage

#artificialintelligence | Dec-15-2021, 01:45:09 GMT

If the process of standardizing numeric data is prone to leakage, then why can't it be skipped? Equal Feature Importance -- Let's say we have two features: final_exam_score and SAT_score [USA college prep test]. On one hand, the final exam has a maximum score of 100, but, on the other hand, the SAT has a maximum score of 1600. If we don't normalize these two features based on their range of possible values, then an algorithm would initially be prone to prioritizing the SAT_score feature because of its larger values. However, if we normalize both features between 0 and 1, then they will be treated equally at the start of training. Help Prevent Gradient Explosion -- Neural networks learn better when input values are close to zero.
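One common way to normalize without leaking information is to compute the scaling statistics on the training rows only and reuse them for the test rows. The sketch below does that for a made-up SAT_score column; the helper names and the sample values are hypothetical, and the article itself may recommend a different recipe.

```python
def fit_min_max(column):
    """Learn the scaling range from training data only."""
    lo, hi = min(column), max(column)
    if hi == lo:
        raise ValueError("column is constant; cannot scale")
    return lo, hi

def apply_min_max(column, lo, hi):
    """Scale values with a previously learned range."""
    return [(x - lo) / (hi - lo) for x in column]

# Made-up values for illustration.
train_sat = [800, 1200, 1600]
test_sat = [1000, 1400]

lo, hi = fit_min_max(train_sat)          # test rows never influence lo/hi
train_scaled = apply_min_max(train_sat, lo, hi)
test_scaled = apply_min_max(test_sat, lo, hi)
```

Fitting the range on the combined train and test rows would let test-set values shape the features the model trains on, which is the leakage the passage warns about. Test values outside the training range simply map outside [0, 1] under this scheme.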

data leakage, maximum score, output value, (1 more...)

#artificialintelligence

Country: North America > United States (0.29)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.43)

Introduction to Reinforcement Learning

#artificialintelligence | Aug-17-2021, 08:45:18 GMT

The idea of CartPole is that a pole stands upright on top of a cart. The goal is to keep the pole balanced upright by moving the cart from side to side. We consider the environment won if we balance the pole for 500 frames, and failed once the pole tilts more than 15 degrees from vertical or the cart moves more than 2.4 units from the middle position. For every frame in which the pole stays "balanced" (less than 15 degrees from vertical), our "score" increases by 1, and our target is a score of 500. Now, how can we do this?
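The scoring rule described here can be written down directly. The sketch below takes the thresholds from the text (15 degrees, 2.4 units, 500 frames) and scores a recorded sequence of frames; the function name and the `(angle, position)` frame format are hypothetical and do not correspond to any particular simulator's API.

```python
MAX_ANGLE_DEG = 15.0    # pole tilt limit from the text
MAX_POSITION = 2.4      # cart distance limit from the text
TARGET_FRAMES = 500     # score at which the episode counts as won

def score_episode(frames):
    """Score an episode given (pole_angle_deg, cart_position) pairs.

    Returns (score, won). The episode ends at the first frame that
    violates a limit, or as soon as the target score is reached.
    """
    score = 0
    for angle, position in frames:
        if abs(angle) > MAX_ANGLE_DEG or abs(position) > MAX_POSITION:
            return score, False
        score += 1
        if score >= TARGET_FRAMES:
            return score, True
    return score, False

perfect = score_episode([(0.0, 0.0)] * 600)
early_fail = score_episode([(1.0, 0.0), (2.0, 0.1), (16.0, 0.2)])
```

An agent's job is then to choose the left or right push at each frame so that this loop keeps running for as long as possible.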

agent, algorithm, information, (14 more...)

#artificialintelligence

Genre: Instructional Material (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Reading and Acting while Blindfolded: The Need for Semantics in Text Game Agents

Yao, Shunyu, Narasimhan, Karthik, Hausknecht, Matthew

arXiv.org Artificial Intelligence | Mar-24-2021

Text-based games simulate worlds and interact with players using natural language. Recent work has used them as a testbed for autonomous language-understanding agents, with the motivation being that understanding the meanings of words or semantics is a key component of how humans understand, reason, and act in these worlds. However, it remains unclear to what extent artificial agents utilize semantic understanding of the text. To this end, we perform experiments to systematically reduce the amount of semantic information available to a learning agent. Surprisingly, we find that an agent is capable of achieving high scores even in the complete absence of language semantics, indicating that the currently popular experimental setup and models may be poorly designed to understand and leverage game texts. To remedy this deficiency, we propose an inverse dynamics decoder to regularize the representation space and encourage exploration, which shows improved performance on several games including Zork I. We discuss the implications of our findings for designing future agents with stronger semantic understanding.

agent, drrn, representation, (14 more...)

arXiv.org Artificial Intelligence

2103.13552

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York > New York County > New York City (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report > New Finding (0.49)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Let's Play Again: Variability of Deep Reinforcement Learning Agents in Atari Environments

Clary, Kaleigh, Tosch, Emma, Foley, John, Jensen, David

arXiv.org Artificial Intelligence | Apr-12-2019

Reproducibility in reinforcement learning is challenging: uncontrolled stochasticity from many sources, such as the learning algorithm, the learned policy, and the environment itself, has led researchers to report the performance of learned agents using aggregate metrics of performance over multiple random seeds for a single environment. Unfortunately, there are still pernicious sources of variability in reinforcement learning agents that make reporting common summary statistics an unsound metric for performance. Our experiments demonstrate the variability of common agents used in the popular OpenAI Baselines repository. We make the case for reporting post-training agent performance as a distribution, rather than a point estimate.
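To make the contrast between a point estimate and a distribution concrete, the sketch below summarizes a set of post-training episode scores with quartiles and extremes alongside the mean. The scores are made up for illustration and the `summarize_scores` helper is hypothetical; it is one simple way to report spread, not the paper's methodology.

```python
import statistics

def summarize_scores(scores):
    """Describe episode scores as a distribution, not just a mean."""
    q1, median, q3 = statistics.quantiles(scores, n=4)
    return {
        "min": min(scores),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(scores),
        "mean": statistics.fmean(scores),
    }

# Made-up scores: one lucky episode dominates the average.
episode_scores = [10, 20, 30, 40, 400]
summary = summarize_scores(episode_scores)
```

Here the mean is far above what the agent scores in a typical episode, which is the kind of gap a single summary number hides.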

machine learning, reinforcement learning, variability, (17 more...)

arXiv.org Artificial Intelligence

1904.06312

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.16)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

AI is trained to create incredibly British place names

Daily Mail - Science & tech | Jul-24-2017, 21:02:02 GMT

From Pratt's Bottom to Giggleswick, Britain is well-known for its unusual place names. But there is now an artificial intelligence programme that can generate its own. Oregon-based programmer Dan Hon, who created the programme for fun, posted a new list of British place names created by the AI on Twitter. The programme analysed thousands of British names of towns and villages and was then trained to make new ones at random. It created almost 4,500 'incredibly British' place names in total, and from 'Filton-on's Forton' to 'Grinachard St Ringley', many sound just like real places.

machine learning, natural language, place name, (17 more...)

Daily Mail - Science & tech

Country:

Europe > United Kingdom (0.27)
North America > United States > Oregon (0.25)
North America > Canada (0.07)

Technology:

Information Technology > Communications > Social Media (0.59)
Information Technology > Artificial Intelligence > Machine Learning (0.54)
Information Technology > Artificial Intelligence > Natural Language (0.36)

Microsoft AI gets maximum score possible on Ms. Pac-Man

#artificialintelligence | Jun-15-2017, 07:30:07 GMT

Humans are now second-best at playing Ms. Pac-Man, a 1980s twist on the arcade classic, involving eating pellets and being chased by ghosts. It was rated as one of the hardest games for an AI to beat, but that didn't stop one. An AI from Microsoft's Maluuba team -- a Canadian deep learning startup the company acquired earlier this year -- has now achieved the maximum possible score of 999,990 in the Atari game, four times the human record. This was achieved using a method of reinforcement learning called Hybrid Reward Architecture. The team taught 150 AI agents to work together in parallel to master the game.

deep learning, machine learning, maximum score, (5 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Games (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)
